Accurate Discrimination of 23 Major Cancer Types via Whole Genome Somatic Mutation Patterns
Abstract
The two strongest factors predicting a human cancer’s clinical behaviour are the primary tumour’s anatomic organ of origin and its histopathology. However, roughly 3% of the time a cancer presents with metastatic disease and no primary can be determined even after a thorough radiological survey. A related dilemma arises when a radiologically defined mass is sampled by cytology yielding cancerous cells, but the cytologist cannot distinguish between a primary tumour and a metastasis from elsewhere. Here we use whole genome sequencing (WGS) data from the ICGC/TCGA PanCancer Analysis of Whole Genomes (PCAWG) project to develop a machine learning classifier able to accurately distinguish among 23 major cancer types using information derived from somatic mutations alone. This demonstrates the feasibility of automated cancer type discrimination based on next-generation sequencing of clinical samples. In addition, this work opens the possibility of determining the origin of tumours detected by the emerging technology of deep sequencing of circulating cell-free DNA in blood plasma.
Related articles
Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing.
Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon insertions from sequencing data, to whole genomes from 200 tumor/normal pairs across 11 tumor types as part of The Cancer Genome Atlas (TCGA) Pan-Cancer Project. In addition to...
Chromatin marks govern mutation landscape of cancer at early stage
Accumulation of somatic mutations over time leads to tissue abnormalities, such as cancer. Somatic mutation rates vary across the genome in a cell-type specific manner, depending on the types of mutation processes [1-7]. Although recent studies have identified several determinants relevant to the establishment of the cancer mutation landscape [8-13], th...
Somatic Mutaome Profile in Human Cancer Tissues
Somatic mutation is a major cause of cancer progression and varied responses of tumors against anticancer agents. Thus, we must obtain and characterize genome-wide mutational profiles in individual cancer subtypes. The Cancer Genome Atlas database includes large amounts of sequencing and omics data generated from diverse human cancer tissues. In the present study, we integrated and analyzed the...
FaSD-somatic: a fast and accurate somatic SNV detection algorithm for cancer genome sequencing data
Recent advances in high-throughput sequencing technologies have enabled us to sequence large number of cancer samples to reveal novel insights into oncogenetic mechanisms. However, the presence of intratumoral heterogeneity, normal cell contamination and insufficient sequencing depth, together pose a challenge for detecting somatic mutations. Here we propose a fast and an accurate so...
Whole genome sequencing analysis for cancer genomics and precision medicine
Explosive advances in next-generation sequencer (NGS) and computational analyses have enabled exploration of somatic protein-altered mutations in most cancer types, with coding mutation data intensively accumulated. However, there is limited information on somatic mutations in non-coding regions, including introns, regulatory elements and non-coding RNA. Structural variants and pathogen in canc...